<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    
    <title>20.8. xml.dom.pulldom — Support for building partial DOM trees &mdash; Python 3.4.3 documentation</title>
    
    <link rel="stylesheet" href="../_static/pydoctheme.css" type="text/css" />
    <link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
    
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    '../',
        VERSION:     '3.4.3',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  true
      };
    </script>
    <script type="text/javascript" src="../_static/jquery.js"></script>
    <script type="text/javascript" src="../_static/underscore.js"></script>
    <script type="text/javascript" src="../_static/doctools.js"></script>
    <script type="text/javascript" src="../_static/sidebar.js"></script>
    <link rel="search" type="application/opensearchdescription+xml"
          title="Search within Python 3.4.3 documentation"
          href="../_static/opensearch.xml"/>
    <link rel="author" title="About these documents" href="../about.html" />
    <link rel="copyright" title="Copyright" href="../copyright.html" />
    <link rel="top" title="Python 3.4.3 documentation" href="../index.html" />
    <link rel="up" title="20. Structured Markup Processing Tools" href="markup.html" />
    <link rel="next" title="20.9. xml.sax — Support for SAX2 parsers" href="xml.sax.html" />
    <link rel="prev" title="20.7. xml.dom.minidom — Minimal DOM implementation" href="xml.dom.minidom.html" />
    <link rel="shortcut icon" type="image/png" href="../_static/py.png" />
    <script type="text/javascript" src="../_static/copybutton.js"></script>
    <script type="text/javascript" src="../_static/version_switch.js"></script>
    
 

  </head>
  <body>  
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="../genindex.html" title="General Index"
             accesskey="I">index</a></li>
        <li class="right" >
          <a href="../py-modindex.html" title="Python Module Index"
             >modules</a> |</li>
        <li class="right" >
          <a href="xml.sax.html" title="20.9. xml.sax — Support for SAX2 parsers"
             accesskey="N">next</a> |</li>
        <li class="right" >
          <a href="xml.dom.minidom.html" title="20.7. xml.dom.minidom — Minimal DOM implementation"
             accesskey="P">previous</a> |</li>
        <li><img src="../_static/py.png" alt=""
                 style="vertical-align: middle; margin-top: -1px"/></li>
        <li><a href="https://www.python.org/">Python</a> &raquo;</li>
        <li>
          <span class="version_switcher_placeholder">3.4.3</span>
          <a href="../index.html">Documentation</a> &raquo;
        </li>

          <li><a href="index.html" >The Python Standard Library</a> &raquo;</li>
          <li><a href="markup.html" accesskey="U">20. Structured Markup Processing Tools</a> &raquo;</li> 
      </ul>
    </div>    

    <div class="document">
      <div class="documentwrapper">
        <div class="bodywrapper">
          <div class="body">
            
  <div class="section" id="module-xml.dom.pulldom">
<span id="xml-dom-pulldom-support-for-building-partial-dom-trees"></span><h1>20.8. <a class="reference internal" href="#module-xml.dom.pulldom" title="xml.dom.pulldom: Support for building partial DOM trees from SAX events."><tt class="xref py py-mod docutils literal"><span class="pre">xml.dom.pulldom</span></tt></a> &#8212; Support for building partial DOM trees<a class="headerlink" href="#module-xml.dom.pulldom" title="Permalink to this headline">¶</a></h1>
<p><strong>Source code:</strong> <a class="reference external" href="https://hg.python.org/cpython/file/3.4/Lib/xml/dom/pulldom.py">Lib/xml/dom/pulldom.py</a></p>
<hr class="docutils" />
<p>The <a class="reference internal" href="#module-xml.dom.pulldom" title="xml.dom.pulldom: Support for building partial DOM trees from SAX events."><tt class="xref py py-mod docutils literal"><span class="pre">xml.dom.pulldom</span></tt></a> module provides a &#8220;pull parser&#8221; which can also be
asked to produce DOM-accessible fragments of the document where necessary. The
basic concept involves pulling &#8220;events&#8221; from a stream of incoming XML and
processing them. In contrast to SAX which also employs an event-driven
processing model together with callbacks, the user of a pull parser is
responsible for explicitly pulling events from the stream, looping over those
events until either processing is finished or an error condition occurs.</p>
<div class="admonition warning">
<p class="first admonition-title">Warning</p>
<p class="last">The <a class="reference internal" href="#module-xml.dom.pulldom" title="xml.dom.pulldom: Support for building partial DOM trees from SAX events."><tt class="xref py py-mod docutils literal"><span class="pre">xml.dom.pulldom</span></tt></a> module is not secure against
maliciously constructed data.  If you need to parse untrusted or
unauthenticated data see <a class="reference internal" href="xml.html#xml-vulnerabilities"><em>XML vulnerabilities</em></a>.</p>
</div>
<p>Example:</p>
<div class="highlight-python3"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">xml.dom</span> <span class="k">import</span> <span class="n">pulldom</span>

<span class="n">doc</span> <span class="o">=</span> <span class="n">pulldom</span><span class="o">.</span><span class="n">parse</span><span class="p">(</span><span class="s">&#39;sales_items.xml&#39;</span><span class="p">)</span>
<span class="k">for</span> <span class="n">event</span><span class="p">,</span> <span class="n">node</span> <span class="ow">in</span> <span class="n">doc</span><span class="p">:</span>
    <span class="k">if</span> <span class="n">event</span> <span class="o">==</span> <span class="n">pulldom</span><span class="o">.</span><span class="n">START_ELEMENT</span> <span class="ow">and</span> <span class="n">node</span><span class="o">.</span><span class="n">tagName</span> <span class="o">==</span> <span class="s">&#39;item&#39;</span><span class="p">:</span>
        <span class="k">if</span> <span class="nb">int</span><span class="p">(</span><span class="n">node</span><span class="o">.</span><span class="n">getAttribute</span><span class="p">(</span><span class="s">&#39;price&#39;</span><span class="p">))</span> <span class="o">&gt;</span> <span class="mi">50</span><span class="p">:</span>
            <span class="n">doc</span><span class="o">.</span><span class="n">expandNode</span><span class="p">(</span><span class="n">node</span><span class="p">)</span>
            <span class="nb">print</span><span class="p">(</span><span class="n">node</span><span class="o">.</span><span class="n">toxml</span><span class="p">())</span>
</pre></div>
</div>
<p><tt class="docutils literal"><span class="pre">event</span></tt> is a constant and can be one of:</p>
<ul class="simple">
<li><tt class="xref py py-data docutils literal"><span class="pre">START_ELEMENT</span></tt></li>
<li><tt class="xref py py-data docutils literal"><span class="pre">END_ELEMENT</span></tt></li>
<li><tt class="xref py py-data docutils literal"><span class="pre">COMMENT</span></tt></li>
<li><tt class="xref py py-data docutils literal"><span class="pre">START_DOCUMENT</span></tt></li>
<li><tt class="xref py py-data docutils literal"><span class="pre">END_DOCUMENT</span></tt></li>
<li><tt class="xref py py-data docutils literal"><span class="pre">CHARACTERS</span></tt></li>
<li><tt class="xref py py-data docutils literal"><span class="pre">PROCESSING_INSTRUCTION</span></tt></li>
<li><tt class="xref py py-data docutils literal"><span class="pre">IGNORABLE_WHITESPACE</span></tt></li>
</ul>
<p><tt class="docutils literal"><span class="pre">node</span></tt> is a object of type <tt class="xref py py-class docutils literal"><span class="pre">xml.dom.minidom.Document</span></tt>,
<tt class="xref py py-class docutils literal"><span class="pre">xml.dom.minidom.Element</span></tt> or <tt class="xref py py-class docutils literal"><span class="pre">xml.dom.minidom.Text</span></tt>.</p>
<p>Since the document is treated as a &#8220;flat&#8221; stream of events, the document &#8220;tree&#8221;
is implicitly traversed and the desired elements are found regardless of their
depth in the tree. In other words, one does not need to consider hierarchical
issues such as recursive searching of the document nodes, although if the
context of elements were important, one would either need to maintain some
context-related state (i.e. remembering where one is in the document at any
given point) or to make use of the <a class="reference internal" href="#xml.dom.pulldom.DOMEventStream.expandNode" title="xml.dom.pulldom.DOMEventStream.expandNode"><tt class="xref py py-func docutils literal"><span class="pre">DOMEventStream.expandNode()</span></tt></a> method
and switch to DOM-related processing.</p>
<dl class="class">
<dt id="xml.dom.pulldom.PullDom">
<em class="property">class </em><tt class="descclassname">xml.dom.pulldom.</tt><tt class="descname">PullDom</tt><big>(</big><em>documentFactory=None</em><big>)</big><a class="headerlink" href="#xml.dom.pulldom.PullDom" title="Permalink to this definition">¶</a></dt>
<dd><p>Subclass of <a class="reference internal" href="xml.sax.handler.html#xml.sax.handler.ContentHandler" title="xml.sax.handler.ContentHandler"><tt class="xref py py-class docutils literal"><span class="pre">xml.sax.handler.ContentHandler</span></tt></a>.</p>
</dd></dl>

<dl class="class">
<dt id="xml.dom.pulldom.SAX2DOM">
<em class="property">class </em><tt class="descclassname">xml.dom.pulldom.</tt><tt class="descname">SAX2DOM</tt><big>(</big><em>documentFactory=None</em><big>)</big><a class="headerlink" href="#xml.dom.pulldom.SAX2DOM" title="Permalink to this definition">¶</a></dt>
<dd><p>Subclass of <a class="reference internal" href="xml.sax.handler.html#xml.sax.handler.ContentHandler" title="xml.sax.handler.ContentHandler"><tt class="xref py py-class docutils literal"><span class="pre">xml.sax.handler.ContentHandler</span></tt></a>.</p>
</dd></dl>

<dl class="function">
<dt id="xml.dom.pulldom.parse">
<tt class="descclassname">xml.dom.pulldom.</tt><tt class="descname">parse</tt><big>(</big><em>stream_or_string</em>, <em>parser=None</em>, <em>bufsize=None</em><big>)</big><a class="headerlink" href="#xml.dom.pulldom.parse" title="Permalink to this definition">¶</a></dt>
<dd><p>Return a <a class="reference internal" href="#xml.dom.pulldom.DOMEventStream" title="xml.dom.pulldom.DOMEventStream"><tt class="xref py py-class docutils literal"><span class="pre">DOMEventStream</span></tt></a> from the given input. <em>stream_or_string</em> may be
either a file name, or a file-like object. <em>parser</em>, if given, must be a
<a class="reference internal" href="xml.sax.reader.html#xml.sax.xmlreader.XMLReader" title="xml.sax.xmlreader.XMLReader"><tt class="xref py py-class docutils literal"><span class="pre">XMLReader</span></tt></a> object. This function will change the
document handler of the
parser and activate namespace support; other parser configuration (like
setting an entity resolver) must have been done in advance.</p>
</dd></dl>

<p>If you have XML in a string, you can use the <a class="reference internal" href="#xml.dom.pulldom.parseString" title="xml.dom.pulldom.parseString"><tt class="xref py py-func docutils literal"><span class="pre">parseString()</span></tt></a> function instead:</p>
<dl class="function">
<dt id="xml.dom.pulldom.parseString">
<tt class="descclassname">xml.dom.pulldom.</tt><tt class="descname">parseString</tt><big>(</big><em>string</em>, <em>parser=None</em><big>)</big><a class="headerlink" href="#xml.dom.pulldom.parseString" title="Permalink to this definition">¶</a></dt>
<dd><p>Return a <a class="reference internal" href="#xml.dom.pulldom.DOMEventStream" title="xml.dom.pulldom.DOMEventStream"><tt class="xref py py-class docutils literal"><span class="pre">DOMEventStream</span></tt></a> that represents the (Unicode) <em>string</em>.</p>
</dd></dl>

<dl class="data">
<dt id="xml.dom.pulldom.default_bufsize">
<tt class="descclassname">xml.dom.pulldom.</tt><tt class="descname">default_bufsize</tt><a class="headerlink" href="#xml.dom.pulldom.default_bufsize" title="Permalink to this definition">¶</a></dt>
<dd><p>Default value for the <em>bufsize</em> parameter to <a class="reference internal" href="#xml.dom.pulldom.parse" title="xml.dom.pulldom.parse"><tt class="xref py py-func docutils literal"><span class="pre">parse()</span></tt></a>.</p>
<p>The value of this variable can be changed before calling <a class="reference internal" href="#xml.dom.pulldom.parse" title="xml.dom.pulldom.parse"><tt class="xref py py-func docutils literal"><span class="pre">parse()</span></tt></a> and
the new value will take effect.</p>
</dd></dl>

<div class="section" id="domeventstream-objects">
<span id="id1"></span><h2>20.8.1. DOMEventStream Objects<a class="headerlink" href="#domeventstream-objects" title="Permalink to this headline">¶</a></h2>
<dl class="class">
<dt id="xml.dom.pulldom.DOMEventStream">
<em class="property">class </em><tt class="descclassname">xml.dom.pulldom.</tt><tt class="descname">DOMEventStream</tt><big>(</big><em>stream</em>, <em>parser</em>, <em>bufsize</em><big>)</big><a class="headerlink" href="#xml.dom.pulldom.DOMEventStream" title="Permalink to this definition">¶</a></dt>
<dd><dl class="method">
<dt id="xml.dom.pulldom.DOMEventStream.getEvent">
<tt class="descname">getEvent</tt><big>(</big><big>)</big><a class="headerlink" href="#xml.dom.pulldom.DOMEventStream.getEvent" title="Permalink to this definition">¶</a></dt>
<dd><p>Return a tuple containing <em>event</em> and the current <em>node</em> as
<tt class="xref py py-class docutils literal"><span class="pre">xml.dom.minidom.Document</span></tt> if event equals <tt class="xref py py-data docutils literal"><span class="pre">START_DOCUMENT</span></tt>,
<tt class="xref py py-class docutils literal"><span class="pre">xml.dom.minidom.Element</span></tt> if event equals <tt class="xref py py-data docutils literal"><span class="pre">START_ELEMENT</span></tt> or
<tt class="xref py py-data docutils literal"><span class="pre">END_ELEMENT</span></tt> or <tt class="xref py py-class docutils literal"><span class="pre">xml.dom.minidom.Text</span></tt> if event equals
<tt class="xref py py-data docutils literal"><span class="pre">CHARACTERS</span></tt>.
The current node does not contain informations about its children, unless
<a class="reference internal" href="#xml.dom.pulldom.DOMEventStream.expandNode" title="xml.dom.pulldom.DOMEventStream.expandNode"><tt class="xref py py-func docutils literal"><span class="pre">expandNode()</span></tt></a> is called.</p>
</dd></dl>

<dl class="method">
<dt id="xml.dom.pulldom.DOMEventStream.expandNode">
<tt class="descname">expandNode</tt><big>(</big><em>node</em><big>)</big><a class="headerlink" href="#xml.dom.pulldom.DOMEventStream.expandNode" title="Permalink to this definition">¶</a></dt>
<dd><p>Expands all children of <em>node</em> into <em>node</em>. Example:</p>
<div class="highlight-python3"><div class="highlight"><pre><span class="n">xml</span> <span class="o">=</span> <span class="s">&#39;&lt;html&gt;&lt;title&gt;Foo&lt;/title&gt; &lt;p&gt;Some text &lt;div&gt;and more&lt;/div&gt;&lt;/p&gt; &lt;/html&gt;&#39;</span>
<span class="n">doc</span> <span class="o">=</span> <span class="n">pulldom</span><span class="o">.</span><span class="n">parseString</span><span class="p">(</span><span class="n">xml</span><span class="p">)</span>
<span class="k">for</span> <span class="n">event</span><span class="p">,</span> <span class="n">node</span> <span class="ow">in</span> <span class="n">doc</span><span class="p">:</span>
    <span class="k">if</span> <span class="n">event</span> <span class="o">==</span> <span class="n">pulldom</span><span class="o">.</span><span class="n">START_ELEMENT</span> <span class="ow">and</span> <span class="n">node</span><span class="o">.</span><span class="n">tagName</span> <span class="o">==</span> <span class="s">&#39;p&#39;</span><span class="p">:</span>
        <span class="c"># Following statement only prints &#39;&lt;p/&gt;&#39;</span>
        <span class="nb">print</span><span class="p">(</span><span class="n">node</span><span class="o">.</span><span class="n">toxml</span><span class="p">())</span>
        <span class="n">doc</span><span class="o">.</span><span class="n">exandNode</span><span class="p">(</span><span class="n">node</span><span class="p">)</span>
        <span class="c"># Following statement prints node with all its children &#39;&lt;p&gt;Some text &lt;div&gt;and more&lt;/div&gt;&lt;/p&gt;&#39;</span>
        <span class="nb">print</span><span class="p">(</span><span class="n">node</span><span class="o">.</span><span class="n">toxml</span><span class="p">())</span>
</pre></div>
</div>
</dd></dl>

<dl class="method">
<dt id="xml.dom.pulldom.DOMEventStream.reset">
<tt class="descname">reset</tt><big>(</big><big>)</big><a class="headerlink" href="#xml.dom.pulldom.DOMEventStream.reset" title="Permalink to this definition">¶</a></dt>
<dd></dd></dl>

</dd></dl>

</div>
</div>


          </div>
        </div>
      </div>
      <div class="sphinxsidebar">
        <div class="sphinxsidebarwrapper">
  <h3><a href="../contents.html">Table Of Contents</a></h3>
  <ul>
<li><a class="reference internal" href="#">20.8. <tt class="docutils literal"><span class="pre">xml.dom.pulldom</span></tt> &#8212; Support for building partial DOM trees</a><ul>
<li><a class="reference internal" href="#domeventstream-objects">20.8.1. DOMEventStream Objects</a></li>
</ul>
</li>
</ul>

  <h4>Previous topic</h4>
  <p class="topless"><a href="xml.dom.minidom.html"
                        title="previous chapter">20.7. <tt class="docutils literal"><span class="pre">xml.dom.minidom</span></tt> &#8212; Minimal DOM implementation</a></p>
  <h4>Next topic</h4>
  <p class="topless"><a href="xml.sax.html"
                        title="next chapter">20.9. <tt class="docutils literal"><span class="pre">xml.sax</span></tt> &#8212; Support for SAX2 parsers</a></p>
<h3>This Page</h3>
<ul class="this-page-menu">
  <li><a href="../bugs.html">Report a Bug</a></li>
  <li><a href="../_sources/library/xml.dom.pulldom.txt"
         rel="nofollow">Show Source</a></li>
</ul>

<div id="searchbox" style="display: none">
  <h3>Quick search</h3>
    <form class="search" action="../search.html" method="get">
      <input type="text" name="q" />
      <input type="submit" value="Go" />
      <input type="hidden" name="check_keywords" value="yes" />
      <input type="hidden" name="area" value="default" />
    </form>
    <p class="searchtip" style="font-size: 90%">
    Enter search terms or a module, class or function name.
    </p>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
        </div>
      </div>
      <div class="clearer"></div>
    </div>  
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="../genindex.html" title="General Index"
             >index</a></li>
        <li class="right" >
          <a href="../py-modindex.html" title="Python Module Index"
             >modules</a> |</li>
        <li class="right" >
          <a href="xml.sax.html" title="20.9. xml.sax — Support for SAX2 parsers"
             >next</a> |</li>
        <li class="right" >
          <a href="xml.dom.minidom.html" title="20.7. xml.dom.minidom — Minimal DOM implementation"
             >previous</a> |</li>
        <li><img src="../_static/py.png" alt=""
                 style="vertical-align: middle; margin-top: -1px"/></li>
        <li><a href="https://www.python.org/">Python</a> &raquo;</li>
        <li>
          <span class="version_switcher_placeholder">3.4.3</span>
          <a href="../index.html">Documentation</a> &raquo;
        </li>

          <li><a href="index.html" >The Python Standard Library</a> &raquo;</li>
          <li><a href="markup.html" >20. Structured Markup Processing Tools</a> &raquo;</li> 
      </ul>
    </div>  
    <div class="footer">
    &copy; <a href="../copyright.html">Copyright</a> 1990-2015, Python Software Foundation.
    <br />
    The Python Software Foundation is a non-profit corporation.
    <a href="https://www.python.org/psf/donations/">Please donate.</a>
    <br />
    Last updated on Feb 26, 2015.
    <a href="../bugs.html">Found a bug</a>?
    <br />
    Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 1.2.3.
    </div>

  </body>
</html>